Romanian Linguistic Resources On Very Large Scale
Abstract
This paper proposes a methodology for building a technological environment for linguistic processing. The environment is intended to conserve, update and exploit strategic linguistic resources of the Romanian language for research, public and commercial purposes, drawing on textual data contributed daily, over the long term, by major publishing houses and mass-media institutions. In essence, the paper describes a technology able to receive, store and continuously process large amounts of textual data supplied by voluntary contributors on a daily basis. Apart from storing linguistic data over the long term for the benefit of preserving the language, the results of the processing will be returned to three categories of users: researchers working on the Romanian language and computational linguistics, the contributors of the resources, and the public at large. The initiative is motivated by the growing need for linguistic resources, including textual data and processing tools, in the social sciences and humanities; meeting that need should bring the Romanian language, still less-resourced, to the level of the technologically rich languages of Europe. Increasing the quantity of resources dedicated to different languages has been a constant preoccupation in Europe over the past 15 years, triggered by the necessity to boost...
Similar resources
Linguistic Resources and Technologies for Romanian Language
This paper reviews notions related to Language Resources and Technologies (LRT), including a brief overview of some resources developed worldwide, with a special focus on the Romanian language. It then describes a joint Romanian, Moldavian and English initiative aimed at developing electronically coded resources for the Romanian language, tools for their maintenance and usage, as well as for the creat...
Optimal Romanian clitics: A cross-linguistic perspective*
Comparative Issues in Romanian Syntax held at the University of New Brunswick, Saint John, Canada; at the 1996 Going Romance conference held in Utrecht, the Netherlands; at the 1997 Linguistic Symposium on Romance Languages held at UC Irvine, and at the 1997 Hopkins Optimality Theory Workshop & University of Maryland Mayfest in Baltimore. I would like to thank audiences at these meetings for th...
A Generic Platform for Developing Language Resources and Applications
The paper describes a unification-based language engineering platform for the development of reversible language resources and linguistic applications. The platform, called EGLU (Environnement Generique Linguistique d’Unification), is an enhanced, generalized port of ISSCO’s original ELU from SUN-OS Allegro Common Lisp to Macintosh Common Lisp and Carnegie Mellon Lisp (under Solaris). Several la...
MULTEXT-East Version 4: Multilingual Morphosyntactic Specifications, Lexicons and Corpora
The paper presents the fourth, “Mondilex” edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development, focused on the morphosyntactic level of linguistic description. This standardised and linked set of resources covers a large number of mainly Central and Eastern European languages and includes the EAGLES-based morphosyntactic specif...
Integration of Large-Scale Linguistic Resources in a Natural Language Understanding System
Knowledge acquisition is a serious bottleneck for natural language understanding systems. For this reason, large-scale linguistic resources have been compiled and made available by organizations such as the Linguistic Data Consortium (Comlex) and Princeton University (WordNet). Systems making use of these resources can greatly accelerate the development process by avoiding the need for the deve...
Journal:
- The Computer Science Journal of Moldova
Volume: 19
Publication date: 2011